A comparison of pivot selection techniques for permutation-based indexing

Authors

  • Giuseppe Amato
  • Andrea Esuli
  • Fabrizio Falchi
Abstract

Recently, permutation-based indexes have attracted interest in the area of similarity search. The basic idea of permutation-based indexes is that each data object is represented as a permutation of a set of pivots (or reference objects), typically obtained by ordering the pivots by their distance from the object. Similarity queries are executed by searching for data objects whose permutation representation is similar to that of the query, on the assumption that similar objects are represented by similar permutations of the pivots. In the context of permutation-based indexing, most authors propose selecting pivots randomly from the data set, on the grounds that traditional pivot selection techniques do not yield better performance. However, to the best of our knowledge, no rigorous comparison has been performed yet. In this paper we compare five pivot selection techniques on three permutation-based similarity access methods. Among those, we propose a novel technique specifically designed for permutations. Two significant observations emerge from our tests. First, random selection is always outperformed by at least one of the tested techniques. Second, no technique is universally the best for all permutation-based access methods; rather, different techniques are optimal for different methods. This indicates that the pivot selection technique should be considered an integral and relevant part of any permutation-based access method.
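The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Euclidean distance, random pivot selection (the baseline the abstract mentions), and the Spearman footrule as the measure of similarity between permutations, none of which the abstract specifies. All function names are hypothetical, and the linear scan in `search` stands in for a real index structure.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def select_random_pivots(data, k, seed=0):
    """Baseline pivot selection: draw k distinct objects at random."""
    rng = random.Random(seed)
    return rng.sample(data, k)


def permutation_of(obj, pivots, dist=euclidean):
    """Represent obj as the pivot indices ordered by increasing distance.

    Ties are broken by pivot index so the result is deterministic.
    """
    return sorted(range(len(pivots)), key=lambda i: (dist(obj, pivots[i]), i))


def spearman_footrule(perm_a, perm_b):
    """Sum over pivots of the difference between their ranks in the two permutations."""
    rank_in_b = {pivot: rank for rank, pivot in enumerate(perm_b)}
    return sum(abs(rank - rank_in_b[pivot]) for rank, pivot in enumerate(perm_a))


def search(query, data, pivots, n_results=1):
    """Rank data objects by how close their permutation is to the query's.

    Illustrative linear scan; a real access method would index the permutations.
    """
    q_perm = permutation_of(query, pivots)
    ranked = sorted(
        data,
        key=lambda o: spearman_footrule(q_perm, permutation_of(o, pivots)),
    )
    return ranked[:n_results]
```

With pivots `(0, 0)`, `(10, 0)` and `(0, 10)`, the objects `(1, 1)` and `(2, 1)` receive the same permutation, while `(9, 1)` receives a different one, so a query at `(2, 1)` ranks `(1, 1)` first. Replacing `select_random_pivots` with another selection strategy changes only which pivots are used, which is the variable the paper studies.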

Similar resources

Pivot Selection Strategies for Permutation-Based Similarity Search

Recently, permutation-based indexes have attracted interest in the area of similarity search. The basic idea of permutation-based indexes is that data objects are represented as appropriately generated permutations of a set of pivots (or reference objects). Similarity queries are executed by searching for data objects whose permutation representation is similar to that of the query. This, of co...

Pivot-based Metric Indexing

The general notion of a metric space encompasses a diverse range of data types and accompanying similarity measures. Hence, metric search plays an important role in a wide range of settings, including multimedia retrieval, data mining, and data integration. With the aim of accelerating metric search, a collection of pivot-based indexing techniques for metric data has been proposed, which reduces...

Deep Permutations: Deep Convolutional Neural Networks and Permutation-Based Indexing

The activations of the hidden layers of Deep Convolutional Neural Networks can be successfully used as features, often referred to as Deep Features, in generic visual similarity search tasks. Recently, scientists have shown that permutation-based methods offer very good performance in indexing and supporting approximate similarity search on large databases of objects. Permutation-based approaches repres...

Pivot Selection Methods Based on Covariance and Correlation for Metric-space Indexing

Metric-space indexing is a general method for similarity queries on complex data. The quality of the index tree is a critical factor in query performance. Bulkloading a metric-space indexing tree can be represented by two recursive steps, pivot selection and data partition, of which pivot selection dominates the quality of the index tree. Two heuristics, based on covariance and correlation, for...

A New Efficient Optimal 2D Views Selection Method Based on Pivot Selection Techniques for 3D Indexing and Retrieval

In this paper, we propose a new method for 2D/3D object indexing and retrieval. The principle consists of an automatic selection of optimal views by using an incremental algorithm based on pivot selection techniques for proximity searching in metric spaces. The selected views are afterward described by four well-established descriptors from the MPEG-7 standard, namely: the color structure descr...

Journal:
  • Inf. Syst.

Volume 52

Publication date: 2015